ABSTRACT 



The Sloan Digital Sky Survey has validated and made publicly available its First 
Data Release. This consists of 2099 square degrees of five-band (ugriz) imaging data, 
186,240 spectra of galaxies, quasars, stars and calibrating blank sky patches selected over 
1360 square degrees of this area, and tables of measured parameters from these data. 
The imaging data go to a depth of r ≈ 22.6 and are photometrically and astrometrically 
calibrated to 2% rms and 100 milli-arcsec rms per coordinate, respectively. The spectra 
cover the range 3800-9200 Å, with a resolution of 1800-2100. Further characteristics of 
the data are described, as are the data products themselves. 

Subject headings: Atlases — Catalogs — Surveys 

1. Introduction 

The Sloan Digital Sky Survey (SDSS) is a photometric and spectroscopic survey, using a 
dedicated 2.5-m telescope at Apache Point Observatory in New Mexico, of many thousands of 
square degrees of high Galactic latitude sky. The scientific goals that define the scope of the project 
(York et al. 2000) relate to large-scale structure seen in the distribution of galaxies and quasars. In 
addition to addressing these issues, the survey data products are proving valuable for many other 
astronomical problems, from asteroids to Galactic structure, from rare types of white dwarf stars 
to the highest-redshift quasars. The validated data are being released at approximately annual 
intervals. Each release includes sufficient information to allow statistical analysis, e.g. measures of 
data quality and the completeness of the source lists. The first SDSS Data Release (DR1) amounts 
to about 20% of the total SDSS survey goal. 

In Summer 2001, the SDSS released the results of observations obtained during the 
commissioning phase of the SDSS; this Early Data Release (EDR) is described by Stoughton et al. 
(2002), which contains extensive information on the SDSS data and data processing software. A 
similarly comprehensive description of the DR1 data and derived parameters may be found at 
http://www.sdss.org/dr1 (hereafter the web site). The purpose of the present paper is to 
formally mark the first SDSS data release and to provide a quick guide to the contents of the web site. 
The sky coverage of the imaging and spectroscopic components of the DR1 is shown in Figure 1. 

2. Published documentation 

A number of papers have been published that provide important technical background relevant, 
but not limited, to DR1. In this section we review these publications. 

A technical summary of the project is given by York et al. (2000). This is an introduction to 
extensive on-line discussion of the hardware (the Project Book), found at 

http://astro.princeton.edu/PBDDK/welcome.htm. The imaging camera is described by Gunn 
et al. (1998). 

The Early Data Release is described by Stoughton et al. (2002), which includes an extensive 
discussion of the data outputs and software. More details of the photometric pipeline may be found 
in Lupton et al. (2001). 

Strauss et al. (2002) give the target-selection procedures for the main galaxy sample of the 
SDSS. This paper provides the basis by which one can construct a statistically complete sample of 
galaxies with spectra. Eisenstein et al. (2001) describe the procedure for targeting a magnitude- 
and color-selected sample of Luminous Red Galaxies (LRGs) at redshifts up to z = 0.55. The 
redshift histograms of the objects from these two samples in the DR1 are given in Figure 2. 

Richards et al. (2002) present the algorithm that is currently being used to target quasars from 
SDSS photometry, although the DR1 sample (like the EDR sample) uses a more heterogeneous 
set of algorithms since the DR1 data predate the implementation of this specific algorithm; see 
Schneider et al. (2003) for more details and a formal catalog of DR1 quasars. The redshift histogram 
of spectroscopically confirmed quasars in the DR1 is given in Figure 3. 

Fig. 1. The distribution on the sky of the imaging scans and spectroscopic plates included in 
the DR1. This is an Aitoff projection in equatorial coordinates. The total sky area covered by the 
imaging is 2099 square degrees, and by the spectroscopy is 1360 square degrees. 

These spectroscopic samples are assigned plates and fibers using an algorithm described by 
Blanton et al. (2003). 

Pier et al. (2003) describe the methods and algorithms involved in the astrometric calibration 
of the survey, and present a detailed analysis of the accuracy achieved. 

The network of primary photometric standard stars is described by Smith et al. (2002). The 
photometric system itself is described by Fukugita et al. (1996), and the system which monitors 
the site photometricity is described by Hogg et al. (2001). 

Fig. 2. Redshift histogram for objects spectroscopically classified as galaxies in DR1. The 
curve labeled "Main Galaxies" is the flux-limited sample, containing 113,199 galaxies. The curve 
labeled "LRGs" is a color-selected sample designed to contain intrinsically luminous, red galaxies, 
containing 15,921 galaxies. 



The official (IAU designation) SDSS naming convention for an object is SDSS JHHMMSS.ss±DDMMSS.s, 
where the coordinates are truncated, not rounded. This format should be used at least once for 
every object listed in a paper using SDSS data. 
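
As an illustration of the "truncated, not rounded" rule, the following minimal Python sketch builds a designation from J2000 coordinates given in decimal degrees. The function name `sdss_designation` and the small tolerance used to guard against floating-point representation error are assumptions made for this example; they are not part of any SDSS software.

```python
import math


def sdss_designation(ra_deg: float, dec_deg: float) -> str:
    """Format coordinates in decimal degrees as SDSS JHHMMSS.ss±DDMMSS.s.

    Both coordinates are truncated (floored in absolute value), never rounded.
    """
    eps = 1e-6  # assumed guard against values such as 29.999999999 that should be 30

    # Right ascension in integer hundredths of a second of time.
    ra_cs = math.floor(ra_deg / 15.0 * 3600.0 * 100.0 + eps)
    hours, rem = divmod(ra_cs, 360000)
    minutes, rem = divmod(rem, 6000)
    seconds, centi = divmod(rem, 100)

    # Declination in integer tenths of an arcsecond; the sign is handled separately.
    sign = "-" if dec_deg < 0 else "+"
    dec_ds = math.floor(abs(dec_deg) * 3600.0 * 10.0 + eps)
    degrees, rem = divmod(dec_ds, 36000)
    arcmin, rem = divmod(rem, 600)
    arcsec, tenth = divmod(rem, 10)

    return (
        f"SDSS J{hours:02d}{minutes:02d}{seconds:02d}.{centi:02d}"
        f"{sign}{degrees:02d}{arcmin:02d}{arcsec:02d}.{tenth:d}"
    )


if __name__ == "__main__":
    print(sdss_designation(187.5, -1.5))
    # A declination of 59.99976 arcsec is truncated to 59.9, not rounded up to 1 arcmin.
    print(sdss_designation(0.0, 0.0166666))
```

Working in integer units (hundredths of a second, tenths of an arcsecond) keeps the truncation explicit and avoids the rounding that ordinary float formatting would apply.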



3. Contents of DR1 

The imaging portion of DR1 comprises 2099 square degrees of sky imaged in five wavebands 
(u,g,r,i, and z), containing photometric parameters of 53 million unique objects. Within this 
area, DR1 includes spectroscopic data (spectra and quantities derived therefrom) for photometrically 
defined samples of quasars and galaxies, as well as incomplete samples of stars. The 
spectroscopic data cover 1360 square degrees. The details of the sky coverage can be found at 
http://www.sdss.org/dr1/coverage/. 

Fig. 3. Redshift histogram for objects spectroscopically classified as quasars in DR1 (and with 
luminosities Mb < -22), including 16,847 objects. The catalog of bona fide quasars in DR1 is 
presented in Schneider et al. (2003). 

SDSS collects imaging data in strips which follow great circles. Two interleaving strips together 
make up a stripe 2.5 degrees wide; at the equator of the system of great circles, stripes are separated 
by 2.5 degrees. A continuous scan of a piece of a strip on a particular night is called a run; this 
is the natural unit of imaging data. Data from 62 runs are included in DR1. The DR1 footprint 
is defined by all non-repeating survey-quality runs within the a priori defined elliptical survey area 
(York et al. 2000) obtained prior to 1 July 2001; in fact, 34 square degrees of DR1 imaging data 
lie outside this ellipse. While the DR1 scans do not repeat a given area of sky, they do overlap to 
some extent, and the data in the overlaps are included in DR1 as well. 

Spectroscopy is undertaken with guided exposures of overlapping tiles ("plates"), each 3 degrees 
in diameter. For each plate, 640 spectroscopic fibers are available. DR1 consists of 291 plates, the 
centers of which lie within the boundaries defined by the DR1 imaging footprint. 

The surface density of spectroscopic targets per square degree consists, on average, of roughly 
90 galaxies in a flux-limited sample, an additional 12 galaxies of a flux- and color-limited sample of 
Luminous Red Galaxies (LRGs), and 18 quasar candidates. Each plate is assigned 18 calibration 
stars and 32 fibers on blank sky for sky subtraction. Finally, extra fibers available in a given 
region of sky are assigned to objects matching ROSAT (X-ray; Voges et al. 1999) and FIRST 
(radio; Becker, White, & Helfand 1995) sources, as well as unusual stars of various types. The 
spectroscopic targeting of all of these samples is based on the photometric quantities produced by 
the SDSS pipelines. 
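
The numbers quoted above can be combined into a rough per-plate fiber budget. The Python sketch below is a back-of-the-envelope check only; it assumes that the effective unique area served by each plate is the total spectroscopic area divided by the number of plates (1360 / 291 square degrees), which is smaller than the geometric area of a 3-degree plate because the plates overlap. The names used are hypothetical.

```python
FIBERS_PER_PLATE = 640
SPECTRO_AREA_DEG2 = 1360.0
N_PLATES = 291

# Average target surface densities quoted in the text, per square degree.
TARGETS_PER_DEG2 = {"main galaxies": 90, "LRGs": 12, "quasar candidates": 18}

# Fibers reserved on every plate.
RESERVED_PER_PLATE = {"calibration stars": 18, "blank sky": 32}


def fiber_budget():
    """Return (effective area per plate, fibers used, fibers left over)."""
    # Assumption: effective area per plate = total area / number of plates.
    area_per_plate = SPECTRO_AREA_DEG2 / N_PLATES
    science = sum(density * area_per_plate for density in TARGETS_PER_DEG2.values())
    used = science + sum(RESERVED_PER_PLATE.values())
    return area_per_plate, used, FIBERS_PER_PLATE - used


if __name__ == "__main__":
    area, used, spare = fiber_budget()
    print(f"effective area per plate: {area:.2f} deg^2")
    print(f"fibers used: {used:.0f}, spare: {spare:.0f}")
```

Under this assumption a few tens of fibers per plate remain on average, which is consistent with the statement that extra fibers go to ROSAT and FIRST matches and unusual stars.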

DR1 includes the footprint of the sky already released in the EDR. All of the EDR data have 
been reprocessed with the latest versions of the SDSS software. In some parts of the sky, better 
data (both imaging and spectroscopy) have been substituted for the older, commissioning data of 
the EDR. 

The data products in DR1 include the following. 

• Images: The "corrected frames" (flat-fielded, sky-subtracted, and calibrated sub-images 
corrected for bad columns, bleed trails and cosmic rays, each 13.6' x 9') in five bands, available 
in both fits and jpeg format; a mask file that records how each pixel was used in the imaging 
data-processing pipelines; 4 x 4-binned images (i.e., with 1.6" pixels) of the corrected frames 
after detected objects have been removed; and "atlas images" (cutouts from the corrected 
frames of each detected object). 

• Image parameters: The positions, fluxes, shapes, and errors thereof for all detected objects 
in the images, as well as information about how these objects were spectroscopically targeted. 

• Spectra: The flux- and wavelength-calibrated, sky-subtracted spectra, with error and mask 
arrays, and the resolution of the spectra as a function of wavelength. 

• Spectroscopic parameters: The redshift and spectral classification of each object with a 
spectrum, as well as the properties of detected emission lines and various further spectral 
indices. 

• Other data products: Astrometric and photometric calibration files, the point spread 
function of the images, gif and postscript plots of spectra, and "finding charts" (cutouts of 
the survey image area according to specified limits in right ascension and declination) in a 
number of formats. 

These DR1 data products are available at the web site, which includes a detailed description of 
the data and documentation of the access tools. 

4. Changes in DR1 with respect to the Early Data Release 

The description of the SDSS data, file structures, and processing pipelines presented in Stoughton 
et al. (2002) remains an essential point of departure for understanding and using the data products 
in the First Data Release. However, there have been some significant changes in the data processing 
since the EDR; we comment briefly here on some of the more important ones. 

• The photometric equations have been reformulated to be in the natural system of the 2.5-m 
telescope, making the relation between measured counts and magnitude a simple one. The 
mean colors of stars on the old and new systems have been forced to be the same. The changes 
from previously published photometry due to this are subtle, typically no more than a few 
hundredths of a magnitude. To distinguish between photometric systems, the new one (u, g, r, i, z) 
is unadorned, whereas the EDR system was designated with asterisks (u* g* r* i* z*). The 
prime system discussed by Fukugita et al. (1996; u' g' r' i' z') now refers only to the native 
system of the US Naval Observatory Flagstaff Station 1-meter telescope (cf., Smith et al. 
2002), and should not be used in referring to the data from the 2.5-m. As before, all 
magnitude zero points are approximately (i.e., within 10%) on an AB system. The magnitude scale 
is not exactly logarithmic, but uses an asinh scaling (Lupton, Gunn & Szalay 1999; see the 
web site for further details). Surface brightnesses, however, are reported on a linear flux scale 
of "maggies"; one maggie corresponds to the surface brightness of a zeroth magnitude object 
in one square arcsecond. A surface brightness of 20th mag in one square arcsecond is therefore 
10^-8 maggies. 

• In the EDR, scattered light produced systematic errors in the derived flat field, and therefore 
in the photometry, especially in the u band. The imaging flat fields have now been corrected 
for this effect, reducing a major source of systematic error. 

• The EDR version of the photometric pipelines had difficulty following rapid variations in the 
point-spread function. The DR1 code is more robust to this problem, and has greatly reduced 
the effects of variable seeing on the photometric measurements. 

• There is a small but measurable non-linearity in the response of the photometric CCDs, 
measuring several percent at saturation. This effect has been corrected in the DR1 processing. 

• The EDR image deblender often shredded galaxies with substructure into several individual 
objects, especially for objects brighter than r ~ 15 mag. This behavior has been suppressed, 
and the vast majority of bright galaxies are now treated properly by the deblender. 

• In addition to the object shape measures of the EDR, the photometric pipeline now calculates 
so-called adaptive moments (cf., Bernstein & Jarvis 2002) that are designed for weak-lensing 
measurements of faint objects. 



• Cosmic rays are recognized as such by their sharp gradients relative to the point-spread 
function. An enhanced routine described in Fan et al. (2001) is now implemented as part of 
the pipeline. This routine sets a flag, MAYBE_CR, which is valuable for assessing the reality of 
objects detected in only a single band. 

• In the EDR, the exponential (Freeman 1970) and de Vaucouleurs (1948) profile models for 
galaxy images were fit only to the central 3 arcsec radius of each object. This procedure tended 
to give misleading results for galaxies with large angular extent. The DR1 version of the code 
does a much more reasonable fit to large galaxies. However, an error was found following 
the completion of DR1 processing, which causes the model magnitudes to be systematically 
under-estimated by 0.2 magnitudes (i.e., the model magnitudes are too bright) for galaxies 
brighter than 20th magnitude. Similarly, the measured radii are systematically too large. This 
error will be corrected in future releases. Note that this error only affects model magnitudes 
of galaxies; all other photometry is unaffected by this error. In addition, model colors are 
essentially unchanged. 

• The astrometric pipeline now uses centroids corrected for asymmetries in the point-spread 
function, and includes a better treatment of chromatic aberration. 

• The spectroscopic pipeline has much improved flat-fielding, bias subtraction, and handling of 
bad columns and pixels. Sky subtraction has been improved, especially in the red, by allowing 
for the gradient in the sky brightness across a spectroscopic plate. The spectrophotometric 
flux calibration is improved as well, as is the correction for absorption lines from the Earth's 
atmosphere. 

• There have been upgrades to the continuum and line-fitting routines in the spectroscopic 
pipeline. More extensive stellar templates have increased the accuracy of the classification of 
unusual types of stars. 

• The galaxy spectral-classification eigentemplates for DR1 are created from a much larger 
sample of spectra (200,000) than were used for the EDR. 
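
The first item in the list above mentions two flux scales: linear "maggies" and asinh magnitudes. The Python sketch below illustrates both. The asinh form follows the expression of Lupton, Gunn & Szalay (1999), m = -(2.5 / ln 10) [asinh((f/f0) / 2b) + ln b]; the softening parameter `B_SOFT` used here is an assumed illustrative value, not a DR1 constant (the per-band values are documented on the web site), and the function names are hypothetical.

```python
import math

B_SOFT = 1.2e-10  # assumed illustrative softening parameter, in units of f0


def mag_to_maggies(mag: float) -> float:
    """Linear flux in maggies for a conventional (logarithmic) magnitude."""
    return 10.0 ** (-0.4 * mag)


def pogson_mag(flux_ratio: float) -> float:
    """Conventional magnitude for a flux f/f0 > 0."""
    return -2.5 * math.log10(flux_ratio)


def asinh_mag(flux_ratio: float, b: float = B_SOFT) -> float:
    """Asinh magnitude for a flux f/f0, which may be zero or negative."""
    return -2.5 / math.log(10.0) * (math.asinh(flux_ratio / (2.0 * b)) + math.log(b))


if __name__ == "__main__":
    # A 20th-magnitude surface brightness expressed in maggies.
    print(mag_to_maggies(20.0))
    # At high signal-to-noise the two magnitude scales agree closely ...
    bright = mag_to_maggies(15.0)
    print(pogson_mag(bright), asinh_mag(bright))
    # ... while the asinh magnitude stays finite at zero flux.
    print(asinh_mag(0.0))
```

The asinh form is useful for faint detections because it remains defined for zero or negative measured fluxes, where a logarithmic magnitude is not.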



5. Data quality 

5.1. Quality of Imaging Data 

The imaging survey is undertaken in photometric conditions (as determined by an all-sky 
10-micron camera) with no moon. We also impose a nominal limit on the effective width of the 
point-spread function of 1.7 arcsec in the r filter. This width is the full width at half maximum of 
the Gaussian with effective area equal to that of the actual point spread function (PSF) at the center 
of each frame; it is therefore somewhat larger than the actual full width at half maximum of the 
PSF, due to the presence of extended low-amplitude wings on the PSF. In the off-line processing, 
data are declared not to be of survey quality if the width of the point-spread function exceeds this 
value for an interval longer than about 10 minutes, or if the point-spread function is seen to be 
rapidly varying, in which case an attempt is made to scan that interval again. The upper panel 
of Figure 4 shows the cumulative distribution of the width of the point-spread function in DR1 as 
determined on a frame-by-frame basis; only a very small fraction of the data in DR1 exceeds the 
seeing threshold. The five filters yield different distributions both because of the dependence of 
seeing on wavelength, and because the separate filters sample distinct regions of the focal plane. 
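
The effective-width definition above can be made concrete with a short Python sketch. It assumes that "effective area" means the noise-equivalent area, 1 / ∫p² for a unit-flux PSF p, which equals 4πσ² for a circular Gaussian of width σ. The double-Gaussian model (a narrow core plus a low-amplitude wing) and all names and numbers are illustrative assumptions, not the survey's PSF model.

```python
import math

FWHM_PER_SIGMA = 2.0 * math.sqrt(2.0 * math.log(2.0))  # about 2.3548


def effective_fwhm(n_eff_area: float) -> float:
    """FWHM of the circular Gaussian whose noise-equivalent area is n_eff_area."""
    sigma = math.sqrt(n_eff_area / (4.0 * math.pi))
    return FWHM_PER_SIGMA * sigma


def double_gaussian_area(sigma_core: float, sigma_wing: float, wing_fraction: float) -> float:
    """Noise-equivalent area of a unit-flux sum of two concentric circular Gaussians."""
    w = wing_fraction
    integral_p2 = (
        (1.0 - w) ** 2 / (4.0 * math.pi * sigma_core**2)
        + w**2 / (4.0 * math.pi * sigma_wing**2)
        + 2.0 * w * (1.0 - w) / (2.0 * math.pi * (sigma_core**2 + sigma_wing**2))
    )
    return 1.0 / integral_p2


if __name__ == "__main__":
    core_fwhm = FWHM_PER_SIGMA * 0.6  # core width in arcsec, assumed for illustration
    area = double_gaussian_area(sigma_core=0.6, sigma_wing=1.5, wing_fraction=0.1)
    print(f"core FWHM:      {core_fwhm:.3f} arcsec")
    print(f"effective FWHM: {effective_fwhm(area):.3f} arcsec")
```

In this example, putting 10% of the flux into a broad wing raises the effective width above the FWHM of the core, which is the behavior described in the text.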

The lower panel of Figure 4 shows the distribution of sky brightness values in DR1, averaging 
over each frame. This value has been corrected for atmospheric extinction to zero airmass, 
and therefore is biased to higher brightness, by 0.65 mag in u, but only 0.08 mag in z. The sky 
brightness, together with the instrumental throughput and the atmospheric extinction and airmass, 
allows one to compute the expected signal-to-noise ratio for the image of an object of known 
brightness and profile; the sensitivity curves for the five bands are available on the web site; cf., 
http://www.sdss.org/dr1/instruments/imager/. 

The photometric zero point is transferred from stars calibrated for this purpose using an 
auxiliary Photometric Telescope, in 27 x 27 arcmin regions distributed approximately every 15 
degrees along each stripe. A number of tests allow us to quantify the uniformity of the photometric 
zero points and the accuracy of the calibrations: 

• Repeatability of photometry in regions of sky in which runs overlap (cf., Ivezic et al. 2003); 

• Constancy of the locus of stars in color-color space; 

• Lack of structure in the stellar or galaxy distribution on the sky correlated with run geometry, 
seeing, foreground reddening, and sky brightness; 

• Comparison of SDSS photometry with externally calibrated standard star fields. 

From these results, the photometric zero point varies across the DR1 footprint by less than 0.02 
mag rms in the r band, 0.02 mag rms in the colors (g - r) and (r - i), and 0.03 mag rms in the 
colors (u - g) and (i - z). 

We can similarly check the astrometric precision; repeat scans confirm that our rms errors are 
rarely worse than 100 milli-arcsec (mas) per coordinate; a more typical number is 60 mas. See Pier 
et al. (2003) for an extensive discussion of the astrometric accuracy. 

The depth of the imaging data is a function of sky brightness and seeing, but comparisons with 
deeper fields from the COMBO-17 survey (Wolf et al. 2003) give 50% completeness for stellar sources 
at (u,g,r,i,z) = (22.5,23.2,22.6,21.9,20.8) under typical conditions. Star-galaxy separation is 
better than 90% reliable to r = 21.6. 

5.2. Quality of Spectroscopic Data 

The spectroscopic survey is undertaken in observing conditions that are not photometric, or 
with seeing worse than 1.7 arcsec FWHM, or with some moonlight. A spectroscopic observation is 
declared to be of survey quality when the mean square of the signal-to-noise ratio per pixel over the 
spectrum is greater than 15 for objects with fiber magnitudes of g = 20.2, r = 20.25, and i = 19.9; 
see the web site for more details. Figure 5 presents the distribution of the square of the signal-to- 
noise ratio per pixel for the 291 plates in DR1, for objects with a fiber magnitude r = 20.25. All 
the plates clearly exceed the threshold of (S/N)^2 = 15; note the presence of several plates with 
signal-to-noise ratio exceeding the minimum requirements by factors approaching 3, as some plates 
were observed for substantially longer times. The sky subtraction is sufficiently accurate that the 
noise is close to the photon shot noise. 

The FWHM of an unresolved emission line in the spectra is typically 2.5 pixels (1 pixel ≈ 65 
km s^-1). From repeat observations of galaxies near the survey limit, the redshift accuracy is known 
to be better than 30 km s^-1; for bright stars, the redshift accuracy may be better than 10 km s^-1. 
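
As a consistency check on these figures, the Python sketch below converts the quoted line width into a resolving power R = c / Δv and the quoted velocity accuracy into a redshift uncertainty Δz ≈ Δv / c. Treating the velocity accuracy as Δv = c Δz is a simplifying assumption made for this example, and the function names are hypothetical.

```python
C_KM_S = 299792.458  # speed of light in km/s


def resolving_power(fwhm_pixels: float, pixel_km_s: float) -> float:
    """Resolving power implied by a line FWHM given in pixels of fixed velocity width."""
    return C_KM_S / (fwhm_pixels * pixel_km_s)


def redshift_error(velocity_error_km_s: float) -> float:
    """Redshift uncertainty corresponding to a velocity uncertainty (assumes dv = c dz)."""
    return velocity_error_km_s / C_KM_S


if __name__ == "__main__":
    print(f"R  ~ {resolving_power(2.5, 65.0):.0f}")
    print(f"dz ~ {redshift_error(30.0):.1e} for 30 km/s")
    print(f"dz ~ {redshift_error(10.0):.1e} for 10 km/s")
```

A 2.5-pixel FWHM at about 65 km/s per pixel corresponds to R of roughly 1800, which lies within the 1800-2100 range of resolution quoted in the abstract.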

The redshifts and classifications have been checked and updated by comparing results from 
two independent codes. Roughly 1% of the spectra (other than the 32 sky spectra per plate) are 
of low enough signal-to-noise ratio as to be unclassifiable; of the remaining, the error rate is below 
half a percent. 

Data quality also depends on the precision and uniformity with which classes of spectroscopic 
targets have been selected and observed. The user should be aware that the selection criteria 
for galaxies and quasars have changed through the period covered by DR1. In particular, the 
magnitude limit for the main galaxy sample moved about somewhat during the commissioning phase 
of the SDSS, ranging from 17.5 to 17.77 in extinction-corrected Petrosian r magnitude. Similarly, 
the quasar target selection algorithm described in Richards et al. (2002) is a modification of that 
used in DR1; the newer version is more complete in high-redshift quasars. See the DR1 web site 
and the papers cited in Section 2 for more details. 

As the name implies, DR1 is the first of a series of releases of what will eventually be the 
entire Sloan Digital Sky Survey. The second data release, DR2, is planned for early 2004. DR2 will 
increase the total amount of data by 50% with respect to DR1, and it will include a reprocessing 
of DR1 which will fix the model magnitude bug mentioned in § 4. 

Funding for the creation and distribution of the SDSS Archive has been provided by the 
Alfred P. Sloan Foundation, the Participating Institutions, the National Aeronautics and Space 
Administration, the National Science Foundation, the U.S. Department of Energy, the Japanese 
Monbukagakusho, and the Max Planck Society. The SDSS Web site is http://www.sdss.org/. 

The SDSS is managed by the Astrophysical Research Consortium (ARC) for the Participating 
Institutions. The Participating Institutions are The University of Chicago, Fermilab, the Institute 
for Advanced Study, the Japan Participation Group, The Johns Hopkins University, Los Alamos 
National Laboratory, the Max-Planck-Institute for Astronomy (MPIA), the Max-Planck-Institute 

Fig. 4. Top panel: Cumulative distribution of values of the seeing in arcsec, for all frames in DR1, 
in each of the five filters. Note that over 90% of the survey data meet the nominal specification of 
seeing better than 1.7 arcsec in r. Bottom panel: Cumulative distribution of the values of the sky 
brightness in units of magnitudes per square arcsec for all frames in DR1, as measured in the five 
filters. 

Fig. 5. Distribution of the values of the square of the spectroscopic signal-to-noise ratio per pixel 
for an object with r = 20.25 through a fiber, for the 291 plates in DR1. The two spectrographs 
(each with 320 fibers) are shown separately. All the DR1 plates have (S/N)^2 > 20; note the presence 
of a number of plates with signal-to-noise ratio much higher than the minimum value. 



